ABSTRACT

For galaxy clustering to provide robust constraints on cosmological parameters and 
galaxy formation models, it is essential to make reliable estimates of the errors on clus- 
tering measurements. We present a new technique, based on a spatial Jackknife (JK) 
resampling, which provides an objective way to estimate errors on clustering statistics. 
Our approach allows us to set the appropriate size for the Jackknife subsamples. The 
method also provides a means to assess the impact of individual regions on the
measured clustering, and thereby to establish whether or not a given galaxy
catalogue is dominated by one or several large structures, preventing it from
being considered a "fair sample". We apply this methodology to the two- and
three-point correlation functions
measured from a volume limited sample of M* galaxies drawn from data release seven 
of the Sloan Digital Sky Survey (SDSS). The frequency of jackknife subsample out- 
liers in the data is shown to be consistent with that seen in large N-body simulations 
of clustering in the cosmological constant plus cold dark matter cosmology. We also 
present a comparison of the three-point correlation function in SDSS and 2dFGRS
using this approach and find consistent measurements between the two samples. 

Key words: galaxies: statistics, cosmology: theory, large-scale structure. 



1 INTRODUCTION 

The clustering of galaxies has the potential to place constraints
on the values of fundamental cosmological parameters
and to probe the efficiency of galaxy formation in dark
matter haloes of different masses. A key assumption made 
when interpreting clustering measurements is that the sam- 
ple used is representative of a much larger volume of the 
Universe. The hope is that the survey volume is sufficiently 
large that the clustering measurements made from it should 
agree with the mean of measurements obtained from an en- 
semble of similar volumes (which is, in general, not feasible 
to carry out). This is the "fair sample" hypothesis (Peebles 
1980). 

This "fair sample" hypothesis, commonly invoked in 
large scale structure analyses, is often abused in the litera- 
ture. The hypothesis relies on two conditions: (a) that the 
clustering statistic for one survey is not biased with respect
to the mean measurement from an ensemble of independent 
but similar surveys; (b) that the error estimate on the statis- 
tic is properly characterized, i.e. it accounts for the variance 
seen in the ensemble of measurements (that is most often 
not achievable with real data). The latter point is often ig- 
nored and samples are referred to as "unfair" when the clus- 
tering statistic (with its associated errors) is at odds with 
either that from other samples (and their errors) or with 
the expected, theoretical value. The point we want to stress 
in this paper is that most surveys are likely to be "fair" 
but that the associated error analysis is likely to be "un- 
fair" . This problem arises because a simplistic (or computa- 
tionally inexpensive) approach to estimating the errors has 
been implemented and insufficient attention has been given 
to the inherent limitations of the method used. The recent 
extensive use of covariance matrices to account for the full 
error budget has brushed aside the important discussion of 
the limitations of the error method considered (as discussed 
at length in e.g. Norberg et al. 2009 for 2-point clustering
statistics). 

Surveys of the local Universe have increased in size 
by several orders of magnitude since the late 1990s, with 
over one million galaxy redshifts measured to date, spanning 
volumes running to several hundreds of cubic megaparsecs 
(York et al. 2000; Colless et al. 2001, 2003). Nevertheless,
some clustering analyses have thrown up unusual results 
which suggest that current surveys may not, at least under 
certain conditions, be fair samples of the Universe. These 
anomalous results were first revealed in the form of the hi- 
erarchical amplitudes, the ratio of higher order correlation 
functions to a power of the two-point function, measured 
from volume limited samples drawn from the two-degree 
field galaxy redshift survey (2dFGRS; Baugh et al. 2004; 
Croton et al. 2004; Gaztanaga et al. 2005). The hierarchical 
amplitudes measured from the 2dFGRS displayed an upturn 
on large scales which is not expected in the gravitational 
instability scenario (e.g. Baugh, Gaztanaga & Efstathiou 
1995). Croton et al. (2004) demonstrated that this was due 
to two particularly "hot" cells in the cell count distribu- 
tion, which contained unusually large numbers of galaxies. 
On blanking out these cells from the survey and removing 
them from the count probability distribution, the hierarchi- 
cal amplitudes resembled the theoretical expectations more 
closely. Similar conclusions were reached in separate anal- 
yses of the two-point and three point correlation functions 
measured from the Sloan Digital Sky Survey (SDSS; Zehavi 
et al. 2002, 2004, 2005, 2011; Nichol et al. 2006; McBride et 
al. 2011a). 

One of the "superstructures" identified in the 2dFGRS 
analyses is part of the SDSS "Great Wall" (Gott et al. 2005). 
Clustering measurements from the magnitude limited 2dF- 
GRS do not show any significant change when this region is 
omitted (e.g. Cole et al. 2005). The influence of this structure 
over clustering measurements seems to depend on how the
sample is constructed, an issue which we investigate further 
in this paper. The two point clustering analyses of Zehavi 
et al. (2002, 2005, 2011) on SDSS have shown explicitly the
influence of the SDSS "Great Wall" on their measurements,
with volume limited samples of M* galaxies being partic- 
ularly affected, whereas samples corresponding to brighter 
galaxy luminosities are less sensitive to the presence of this 
structure. Different volume limited samples will weight the 
structure differently, as it will be traced out by a different 
number of galaxies. Also, the volume of the sample changes 
when the luminosity defining it is varied; in a larger volume,
other, similar structures may feature, diluting the impact of 
any single structure on clustering measurements. 

Croton et al. (2004, 2007), as well as Gaztañaga et
al. (2005), presented measurements with and without the
galaxies in the hot regions identified in their counts in cells 
analysis of the 2dFGRS, with the simple aim of showing 
the contribution of this structure on the hierarchical mo- 
ments. Their approach was specific to the counts in cells 
clustering analysis. In this paper we present an objective 
technique which can be applied to any clustering statistic. 
The new method we describe is a development of the Jack- 
knife technique for error estimation (see e.g. Norberg et al. 
2009 for a review and application of this method and others 
to galaxy clustering error estimations; see also Zehavi et al.
2002, who give the first comprehensive description of the
Jackknife method for clustering statistics, which was used in
the earlier clustering analyses of Roche et al. 1993 and Croft
et al. 1999). Our approach can be used to assess whether
or not a clustering signal is unduly influenced by the galaxy 
distribution in one particular region of a survey. Further- 
more, we discuss a new statistic, the JK ensemble fluctua- 
tion, which provides a robust assessment of such fluctuations 
in the galaxy distribution. We note here that McBride et al. 
(2011a) present a similar analysis for their SDSS 3-point 
function measurement (in particular their Figs. 11, 12, and 
13 are very relevant for the present paper) . The main differ- 
ence between their work and ours is that we take the study 
of the influence of extreme structure significantly further, by 
introducing a new statistic, the JK ensemble fluctuation.

This paper is laid out as follows. In Section 2 we present
the galaxy survey data used, which is the seventh data release
of the SDSS, the associated completeness masks, the
division of the SDSS into Jackknife regions (or JK quilts)
and the clustering statistics used. Section 3 is devoted to
our new objective method based on Jackknife resampling
of the data, which is applied to the M* SDSS sample, and
shows the impact of large structures or "superstructures"
on 2- and 3-point statistics in redshift space^1. A new statistic,
the JK ensemble fluctuation, is introduced to quantify
the importance of outliers. In Section 4 we illustrate the
performance of this new statistic and show how it successfully
identifies unusual regions in SDSS DR7 and sets their
significance. In Section 5 we apply our Jackknife approach to
$\Lambda$CDM simulations to show that they display similar outliers
to those seen in the data. In Section 6 we revisit the
3-pt correlation function analysis of Gaztañaga et al. (2005),
but this time using our new methodology based on the
JK ensemble fluctuation. A summary is given in Section 7.
Throughout we assume a standard cosmological model, with
$\Omega_{\rm M} = 0.25$, $\Omega_\Lambda = 0.75$ and a value for the Hubble parameter
of $h = H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}})$.



2 DATA AND METHODOLOGY 

In this section we describe the galaxy data used (§ 2.1), the 
completeness mask of the survey (§ 2.2), the division of the 
survey into zones for Jackknife sampling (§ 2.3), the volume 
limited samples used in the clustering analysis presented in 
subsequent sections (§ 2.4) and the method used to estimate the 2-
and 3-point correlation functions (§ 2.5).

2.1 SDSS DR7 Data 

In this paper we use Data Release 7 (DR7) of the Sloan Dig- 
ital Sky Survey (SDSS; Abazajian et al. 2009). DR7 covers 
9380 deg$^2$ and contains 930 000 galaxy redshifts. We select
galaxies brighter than a Petrosian magnitude of r < 17.65. 
The magnitude limit is slightly brighter than that of the 
canonical main galaxy sample of r = 17.77 (Strauss et al. 
2002). We adopt this cut to avoid having to model a varying 
magnitude limit, which is needed to include all of the early 

^1 This paper considers only analyses in redshift space, even
though our method is perfectly valid for other clustering statistics 
in both real and redshift space. 



SAGS-IV: An objective way to quantify the impact of LSS 3 




[Figure 1 image: colour bar running from 0.80 to 1.00; horizontal axis $\alpha$ [deg].]

Figure 1. A zoomed in view of the SDSS redshift completeness 
mask. The colour bar indicates the redshift completeness fraction 
which ranges from 0.8 to 1. Sectors (i.e. uniquely defined regions 
sampled by the spectroscopic tiles) with completeness below 0.8 
are shaded in yellow and regions outside the survey area in white. 

SDSS data, since the magnitudes have been revised slightly 
since the early data were taken. In addition we also impose 
a bright magnitude limit of r = 15, the point at which SDSS 
galaxy magnitudes start to become less reliable. The sam- 
ple we consider contains 513k high quality, unique galaxy 
redshifts with a median redshift of $z_{\rm med} \simeq 0.10$. The precise
choice of magnitude limit does not have an impact on our 
results. In our analysis we use SDSS Petrosian magnitudes 
calibrated using the prescription of Tucker et al. (2006) and 
corrected for Galactic extinction. 



2.2 SDSS survey mask 

In order to make robust and reliable estimates of clustering 
the incompleteness in the spectroscopic catalogue needs to 
be taken into account. We do this using a redshift complete- 
ness mask. The variation in the survey completeness on the 
sky is then used to modulate the density of unclustered or 
random points used in the estimation of the mean density 
in clustering analyses. 

We start by constructing a mask from the individual 
SDSS imaging scans. The imaging mask is pixelized us- 
ing an equal-area projection, with a typical pixel area of 
$\sim 5.3$ arcmin$^2$. This is more than sufficient given the average
number density of SDSS targets of $\sim 90$ galaxies deg$^{-2}$
and the typical scales we are interested in ($s > 1\,h^{-1}$ Mpc
at $z > 0.04$). Bad pixels, as labelled by the SDSS pipeline,
are omitted from the imaging mask. Next, using the position 
of the spectroscopic tiles, we create the spectroscopic completeness
mask, following the method devised for the 2dFGRS
(Colless et al. 2001; Norberg et al. 2002; Cole
et al. 2005). The survey is divided up into sectors which are 
regions defined by the unique overlap of spectroscopic tiles. 
Any sector which contains fewer than 10 galaxies is merged
with its most populous neighbouring sector. This ensures
that the sector completeness is not affected by shot-noise, 
which would lead to a patchy redshift completeness mask. 
The redshift completeness in a sector is the number of galaxies
with a measured (high quality) spectroscopic redshift divided
by the total number of galaxies in the target catalogue
in that sector. The CasJobs SQL queries and data files used 
to retrieve the information required to construct the SDSS 
survey masks (imaging and spectroscopic) are given in Appendix A.
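The sector merging and completeness calculation described above can be sketched as follows. This is only an illustrative sketch and not the pipeline used in the paper: the function name, the input format and the `neighbours` adjacency dictionary are hypothetical, and we assume that the threshold of 10 galaxies refers to galaxies in the target catalogue.

```python
from collections import Counter


def sector_completeness(targets, redshifts, neighbours, min_count=10):
    """Redshift completeness per sector, merging sparsely populated sectors.

    targets    : list of sector ids, one entry per galaxy in the target catalogue
    redshifts  : list of sector ids, one entry per galaxy with a good redshift
    neighbours : dict mapping a sector id to its adjacent sector ids
                 (hypothetical input; the adjacency comes from the tile geometry)
    """
    n_target = Counter(targets)
    n_z = Counter(redshifts)
    parent = {s: s for s in n_target}  # each sector starts as its own group

    def root(s):
        # Follow the merge chain to the sector that now holds the counts.
        while parent[s] != s:
            s = parent[s]
        return s

    # Visit the sparsest sectors first and merge them into their most
    # populous neighbour until the merged group reaches min_count targets.
    for s in sorted(n_target, key=lambda k: n_target[k]):
        r = root(s)
        if n_target[r] >= min_count:
            continue
        candidates = {root(t) for t in neighbours.get(s, ()) if t in parent} - {r}
        if not candidates:
            continue
        best = max(candidates, key=lambda t: n_target[t])
        parent[r] = best
        n_target[best] += n_target[r]
        n_z[best] += n_z[r]

    return {s: n_z[root(s)] / n_target[root(s)] for s in parent}
```

For simplicity the sketch only uses the neighbours of the original sector, not those of an already merged group, which is sufficient to illustrate how merging suppresses shot-noise in the completeness estimate.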

We show a small part of the survey mask we have constructed
for the SDSS DR7 in Fig. 1. Regions outside the
survey boundary are shaded white. Sectors which are less 
than 80% complete but which lie within the survey bound- 
ary are shaded yellow. The remaining sectors have a spec- 
troscopic completeness ranging between 80 and 100%. It is 
these latter regions that are retained in our subsequent clus- 
tering analysis. 

2.3 SDSS Jackknife quilts 

Our goal in this paper is to assess the impact of unusual 
structures on clustering statistics. We do this by dividing 
the SDSS survey up into different regions to assess their 
contribution to the clustering signal. A framework in which 
to do this analysis is provided by the Jackknife approach 
to error estimation (see e.g. Norberg et al. 2009). In this 
method, a dataset is divided into $N_{\rm sub}$ zones. The clustering
is then measured from a sample made up of $N_{\rm sub} - 1$ of these
zones, leaving one of the zones out. This process is repeated
$N_{\rm sub}$ times, leaving out each one of the zones in turn. The
scatter between the clustering measurements from the Jack- 
knife samples is used to estimate the error on the clustering 
statistic. For completeness we quote here the standard relation
used to estimate the Jackknife covariance matrix (see
e.g. Norberg et al. 2009; Zehavi et al. 2002) when the data are split into
$N_{\rm sub}$ samples:

$$C_{\rm JK}(x_i, x_j) = \frac{N_{\rm sub}-1}{N_{\rm sub}} \sum_{k=1}^{N_{\rm sub}} (x_i^k - \bar{x}_i)(x_j^k - \bar{x}_j), \qquad (1)$$

where $x_i$ is the $i^{\rm th}$ measure of the statistic of interest and
$x_i^k$ is its value in the $k^{\rm th}$ Jackknife resampling. It is
assumed that the mean expectation value is given by

$$\bar{x}_i = \sum_{k=1}^{N_{\rm sub}} x_i^k / N_{\rm sub}, \qquad (2)$$

but it can also be given simply by the measurement over the
whole sample. Note the factor of $N_{\rm sub} - 1$ which appears in
Eq. 1 (Tukey 1958; Miller 1974). Qualitatively, this factor
takes into account the lack of independence between the
$N_{\rm sub}$ copies or resamplings of the data; recall that from one
copy to the next, only two sub-volumes are different (or
equivalently $N_{\rm sub} - 2$ sub-volumes are the same).
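As an illustration of Eqs. 1 and 2, a minimal sketch of the Jackknife covariance estimate is given below. The function name and array layout are our own choices for this example; the input is assumed to be an array holding the statistic measured in each of the $N_{\rm sub}$ resamplings.

```python
import numpy as np


def jackknife_covariance(x_jk, x_full=None):
    """Jackknife covariance matrix of Eq. 1.

    x_jk   : array of shape (N_sub, N_bins); row k is the statistic measured
             with zone k omitted
    x_full : optional array of shape (N_bins,); if given, it replaces the
             mean over resamplings (Eq. 2) as the reference value
    """
    x_jk = np.asarray(x_jk, dtype=float)
    n_sub = x_jk.shape[0]
    if x_full is None:
        centre = x_jk.mean(axis=0)  # Eq. 2
    else:
        centre = np.asarray(x_full, dtype=float)
    dev = x_jk - centre
    # The (N_sub - 1) / N_sub prefactor accounts for the strong overlap
    # between the resamplings.
    return (n_sub - 1) / n_sub * dev.T @ dev
```

A simple sanity check is the sample mean: for that statistic the Jackknife variance reduces to the usual unbiased sample variance divided by the number of data points.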

Here we divide the SDSS survey into zones in right as- 
cension and declination, producing what we call the sur- 
vey "quilt". The zones have equal areas after taking into 
account their spectroscopic completeness, using the completeness
mask described above. We use square 5x5 (i.e.
$N_{\rm sub} = 25$ Jackknife zones) and 15x15 (i.e. $N_{\rm sub} = 225$ Jackknife
zones) Jackknife quilt patterns in the right ascension
($\alpha$) and declination ($\delta$) plane, leading to the quilt patchworks
shown in Fig. 2. We also investigated Jackknife quilts
Figure 2. Left: The SDSS survey divided into 5x5 Jackknife regions in the $\alpha$-$\delta$ plane. The 25 zones in the resulting crazy quilt
have equal area after taking into account their spectroscopic completeness. Right: Same as the left-hand panel, but for the case of 225 
Jackknife zones corresponding to a square 15x15 Jackknife quilt. 



Table 1. A summary of the characteristics of the volume limited samples considered in this paper. Col. 1 - volume label; col. 2 -
absolute magnitude range; col. 3 and 4 - minimum and maximum redshift limits respectively; col. 5 - sample volume; col. 6 - number
of galaxies with redshift. Note that M* is defined as in Blanton et al. (2003b), i.e. $M^* - 5\log_{10} h = -20.44$. The first three entries in the
table correspond to approximately the same volume.

Label        M-range                 z_min    z_max    V / h^-3 Mpc^3    N_gal

ref          M*-0.5 < M < M*+0.5     0.0508   0.1065   (258.3)^3         98 317
faint-ref    M* < M < M*+0.5         0.0405   0.1065   (263.5)^3         63 916
bright-ref   M*-0.5 < M < M*         0.0508   0.1065   (258.3)^3         37 914
bright-all   M*-0.5 < M < M*         0.0508   0.1325   (325.9)^3         76 332
bright-far   M*-0.5 < M < M*         0.1065   0.1325   (259.0)^3         38 418



with other dimensions ($N_{\rm sub} = 16$, 49 and 100); our results
are robust to the number of Jackknife zones considered. 

For the SDSS footprint considered in this paper, $\sim 6250$ deg$^2$,
each zone in the 25 Jackknife quilt covers a completeness
weighted area of precisely 250 deg$^2$, while the average
area subtended by a zone is $279.7 \pm 3.3$ deg$^2$. The corresponding
numbers for the 225 Jackknife quilt are 27.7 deg$^2$
and $31.1 \pm 0.6$ deg$^2$. Because of the high spectroscopic completeness
threshold considered here and the uniform tiling
of the SDSS survey, the relative rms scatter in the area
subtended by each Jackknife zone is less than 2 per cent.
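One possible way to build a quilt with equal completeness weighted areas is sketched below. The paper does not specify the algorithm used, so this is an assumption made purely for illustration: the mask pixels are first split into strips of equal total weight in right ascension, and each strip is then split into cells of equal weight in declination. The function names are hypothetical, the weight of a pixel is taken to be its area times its completeness, and right ascension wrap-around is ignored.

```python
import numpy as np


def equal_weight_bins(coord, weight, n_bins):
    """Bin index per element such that bins along coord carry ~equal weight."""
    order = np.argsort(coord, kind="stable")
    w_sorted = weight[order]
    cum = np.cumsum(w_sorted)
    # Cumulative weight fraction evaluated at the middle of each pixel.
    frac = (cum - 0.5 * w_sorted) / cum[-1]
    idx = np.minimum((frac * n_bins).astype(int), n_bins - 1)
    out = np.empty(len(coord), dtype=int)
    out[order] = idx
    return out


def build_quilt(ra, dec, weight, n_side):
    """Zone label (0 .. n_side**2 - 1) for each mask pixel."""
    ra, dec, weight = (np.asarray(a, dtype=float) for a in (ra, dec, weight))
    strip = equal_weight_bins(ra, weight, n_side)  # split in right ascension
    zone = np.empty(len(ra), dtype=int)
    for s in range(n_side):
        sel = np.flatnonzero(strip == s)
        cell = equal_weight_bins(dec[sel], weight[sel], n_side)  # then in declination
        zone[sel] = s * n_side + cell
    return zone
```

With this construction the total weight per zone can differ from the mean by at most of order one pixel weight, so for a finely pixelized mask the zones have very nearly equal completeness weighted area, while their geometric areas differ, as in the numbers quoted above.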



2.4 Volume limited samples 

In this paper we focus our attention on galaxies around M*,
which in the SDSS r-band corresponds to $M^* - 5\log_{10} h = -20.44$
(Blanton et al. 2003b). In order to obtain an absolute
magnitude for each galaxy, and also to construct volume
limited samples, we need to adopt a $k+e$-correction. We apply
a global $k+e$-correction to a nominal reference redshift
$z_{\rm ref} = 0.1$ following Blanton et al. (2003a,b). A volume limited
sample is defined by the apparent magnitude range of
the survey and a bin in luminosity: a galaxy in a given luminosity
bin would fall within the apparent magnitude range
of the survey if placed at any redshift between the limits
$z_{\rm min}$ and $z_{\rm max}$. Here we consider a variety of volumes and
luminosity bins close to M*, and their basic properties are
listed in Table 1. A brief description of each sample is as
follows:



• ref - the volume limited sample for galaxies in a one 
magnitude wide bin centred on M*. 

• bright-ref - the bright half magnitude bin of the ref 
sample. 

• faint-ref - the volume limited sample for galaxies in
the half-magnitude bin fainter than M*: this volume is ~ 6
per cent larger than that of the ref sample.

• bright-all - the volume limited sample for galaxies in 
the half-magnitude bin brighter than M*. By construction 
this sample extends to a higher redshift than the bright-ref 
sample. 
Figure 3. Top: The angular galaxy distribution of the ref volume limited sample (dots). Bottom: A zoomed in view centered on zone
23. Overplotted are the boundaries of the zones for the 5x5 and 15x15 Jackknife quilts, shown in thick red and thin blue linestyles
respectively. The subzone numbers (colour coded according to the dimension of the quilt they belong to) are indicated for a limited 
number of patches, some of which will be referred to in later sections. The subzone number increases from left to right from top to 
bottom. 



• bright-far - the high redshift half of the bright-all
volume. The bright-ref and bright-far samples combined 
give the bright-all sample. 

In summary, the ref and bright-ref volumes are identical;
the ref and faint-ref volumes differ by only ~ 6 per cent;
the ref and bright-far volumes are fully disjoint but cover
(to within less than one per cent) an equal volume of space,
i.e. $\sim 0.017\,h^{-3}$ Gpc$^3$. The CasJobs SQL query used to generate
the input catalogue from which these volume limited
samples are constructed is given in Appendix A.
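The volume comparisons quoted in this summary follow directly from the volumes listed in Table 1, as the short check below illustrates. The side lengths are copied from the table; the variable names are ours.

```python
# Side lengths (in h^-1 Mpc) of the cubes with the same volume as each
# sample, as quoted in Table 1.
side = {
    "ref": 258.3,
    "faint-ref": 263.5,
    "bright-ref": 258.3,
    "bright-all": 325.9,
    "bright-far": 259.0,
}

# Volumes in h^-3 Gpc^3 (1 Gpc^3 = 1e9 Mpc^3).
volume = {name: s ** 3 / 1e9 for name, s in side.items()}

# Fractional excess of the faint-ref volume over the ref volume (~6 per cent).
faint_excess = volume["faint-ref"] / volume["ref"] - 1.0

# Fractional difference between the disjoint bright-far and ref volumes (<1 per cent).
far_vs_ref = volume["bright-far"] / volume["ref"] - 1.0

# bright-ref and bright-far together should make up bright-all.
sum_vs_all = (volume["bright-ref"] + volume["bright-far"]) / volume["bright-all"] - 1.0
```

The ref volume evaluates to about 0.017 $h^{-3}$ Gpc$^3$, and the bright-ref and bright-far volumes add up to the bright-all volume to well within the rounding of the tabulated values.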

In Fig. 3 we show the full (top) and zoomed in view
(bottom) of the angular galaxy distribution of the ref volume
limited sample viewed through the completeness mask,
together with the boundaries of the zones of the 5x5 and
15x15 Jackknife quilts (thick red and thin blue lines respectively).
Big empty patches, like the one at $\alpha \sim 175$ deg and
$\delta \sim 6$ deg, correspond to areas masked by the survey completeness
mask. This plot shows clear evidence of large coherent
galaxy structures, spanning several zones: this is precisely
the type of large scale structure whose influence on
clustering statistics we aim to investigate in the paper.

6 P. Norberg, E. Gaztanaga, CM. Baugh & D.J. Croton

[Figure 4 image: panels labelled "SDSS Reference"; horizontal axes $\log_{10}(s / h^{-1}\,{\rm Mpc})$ (left) and $\alpha$ in degrees (right).]

Figure 4. The distribution of clustering measurements from the Jackknife resamplings of the ref sample. The left-hand panel shows
the measurements for $\xi(s)$ and the right-hand panel shows $Q_3(\alpha)$, with $\alpha$ in degrees. In both cases, we plot the ratio of the correlation
function measured from each Jackknife resampling to that measured from the full ref sample. The lowest amplitude correlation functions
are labelled by the zone number omitted in the resampling. The larger (blue) errorbars show the rms error estimated from a Jackknife
resampling of all of the zones, which corresponds to the central 68% of the distribution of measurements. In the left hand panel we also
show the slightly smaller (red) errorbars, corresponding to the rms error estimated on omitting zone 23 from the Jackknife resampling
altogether. Note that these errorbars are smaller by a factor $\sqrt{N_{\rm sub}}$ than those assigned by Jackknife errors to correlation function
measurements (see e.g. Eq. 1).

2.5 Clustering estimators 

In this paper we consider the spherically averaged 2-pt
correlation function, $\xi(s)$, and the reduced 3-point correlation
function, $Q_3(\alpha)$. The clustering estimators used are
described in detail in Norberg et al. (2009) and Gaztañaga
et al. (2005), for $\xi(s)$ and $Q_3(\alpha)$ respectively.

In the case of $\xi(s)$, we count all pairs out to a maximum
separation in redshift space of $60\,h^{-1}$ Mpc. For $Q_3(\alpha)$
we consider triplets in which one side of the triangle is
$8\,h^{-1}$ Mpc and the other is $16\,h^{-1}$ Mpc, with an opening
angle of $\alpha$ between these two sides. This is one of the many
configurations considered in Gaztañaga et al. (2005) (see
their Fig. 1 for an illustration of the triplet). We have focused
on a minimum of $8\,h^{-1}$ Mpc scales for SDSS (as opposed
to $6\,h^{-1}$ Mpc in 2dFGRS) because the larger SDSS
volume makes it possible to explore larger (weakly non-linear)
scales. We follow the implementation of the Jackknife
method outlined in Norberg et al. (2009), and briefly
summarized in Section 2.3.

We estimate $\xi(s)$ and $Q_3(\alpha)$ for all of the volume limited
samples listed in Table 1, as well as for the Jackknife
resamplings of the different dimension quilts considered in
this paper, i.e. mainly $N_{\rm sub} = 25$ and 225, but also $N_{\rm sub} =
16$, 49 and 100.
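To make the triplet configuration concrete, the sketch below computes the third side of the triangle as a function of the opening angle $\alpha$ and evaluates a reduced three-point amplitude. The estimators themselves are those of the cited papers and are not reproduced here; we only assume the standard hierarchical definition of the reduced three-point function, $Q_3 = \zeta / (\xi_{12}\xi_{13} + \xi_{12}\xi_{23} + \xi_{13}\xi_{23})$, and the function names are our own.

```python
import math


def third_side(r12, r13, alpha_deg):
    """Length of the side opposite the opening angle alpha (law of cosines)."""
    alpha = math.radians(alpha_deg)
    return math.sqrt(r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * math.cos(alpha))


def reduced_q3(zeta, xi12, xi13, xi23):
    """Reduced three-point function in the standard hierarchical form
    (assumed here to match the definition used by the adopted estimator)."""
    return zeta / (xi12 * xi13 + xi12 * xi23 + xi13 * xi23)


# Triplets with fixed sides of 8 and 16 h^-1 Mpc: the third side grows
# from 8 h^-1 Mpc (alpha = 0) to 24 h^-1 Mpc (alpha = 180 deg).
sides = [third_side(8.0, 16.0, a) for a in (0.0, 90.0, 180.0)]
```

Varying $\alpha$ at fixed side lengths therefore probes pair separations between 8 and 24 $h^{-1}$ Mpc along the third side of the triangle.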



3 IDENTIFICATION OF OUTLYING REGIONS 

We use a Jackknife technique to identify survey regions 
which have a big influence on the measured clustering signal. 
The method consists of comparing clustering measurements 
from each of the Jackknife resamplings of the data with that 
obtained from the full dataset. If the measurement from a 
particular resampling is unusual compared with the measurements
obtained from the others, then the omitted zone is referred
to as an outlying zone. The key here is the definition of "unusual".
The very presence of an outlying zone affects the distribution
of measurements from the Jackknife resamplings,
with the consequence that simple concepts which apply to 
Gaussian distributions, such as the mean and variance, need 
to be redefined. To deal with this we introduce two new 
statistics to quantify outliers: the JK resampling fluctuation 
and the JK ensemble fluctuation. 

In this section we apply the Jackknife resampling to the 
ref sample listed in Table 1. We consider the clustering in
the other samples listed in Table 1 in Section 4.



3.1 The JK resampling fluctuation 

In Fig. 4 we plot the two and three point correlation functions
measured from each of the Jackknife resamplings of
the ref sample. We show the distribution of these measurements
by plotting the scaled quantity $\Delta_i$ which is defined
as

$$\Delta_i = \frac{x_i}{x_{\rm full}} - 1, \qquad (3)$$






[Figure 6 image: panels titled "SDSS Reference $\xi(s)$" (left) and "$Q_3$ SDSS Reference" (right); horizontal axis: subzone number.]

Figure 6. The JK ensemble fluctuation, $\delta^i_{\rm JK}$, defined in Eq. 6, using pair separations of $12 - 14\,h^{-1}$ Mpc in $\xi(s)$ (left panel) and
triplets separated by similar scales for $Q_3(\alpha)$ (right panel). Each point corresponds to leaving out the zone number plotted on the x-axis.
This new statistic is shown in both panels for a $N_{\rm sub} = 25$ Jackknife quilt, as shown on the left panel of Fig. 2. The fluctuation is plotted
using a cross if its value lies within ±3; otherwise the point is labelled with the zone number that was omitted.



where $x_i$ represents a clustering statistic, which in the
present paper will be either the two or three point function
measured from a JK resampling of the data on omitting zone
$i$, and $x_{\rm full}$ indicates the corresponding measurement using
the full sample. We call this quantity the JK resampling
fluctuation as it quantifies the deviation of the clustering
measured on omitting a particular zone from the measurement
using the full dataset.

In Fig. 4 we use 25 Jackknife zones. The left panel of
Fig. 4 shows the scatter in the two-point correlation function
measured from the different resamplings and the right
hand panel shows the scatter in $Q_3(\alpha)$. The measurements
of $\xi(s)$ and $Q_3(\alpha)$ when omitting zone 23 (which is indicated
in Fig. 3) are clear outliers compared to the measurements
from the other jackknife resamplings. Both of these measurements
lie well outside the variance of the resamplings
which is indicated in the plot by the error bars. Note that
the variance here corresponds to the range which encloses
the central 68% of the JK measurements. Of course, when
interpreting this plot it is important to remember that bins
of pair separation or opening angle are correlated. Furthermore
the measurements from different resamplings are also
correlated, as they differ by only 8% in area and hence by
the same amount in volume when using 25 Jackknife zones.

3.2 The JK ensemble fluctuation 

The fluctuation in the clustering signal on omitting zone 23
is in fact even more significant than suggested by Fig. 4.
To demonstrate this we now apply the Jackknife technique
to the data in a slightly different way from the standard
approach as outlined in Norberg et al. (2009). The relative
rms error estimated in the standard Jackknife scheme by
splitting the data into $N_{\rm sub}$ zones is denoted by $\sigma_{\rm tot}$:

$$\sigma_{\rm tot}^2 = \frac{1}{N_{\rm sub}} \sum_{i=1}^{N_{\rm sub}} \Delta_i^2, \qquad (4)$$



where $\Delta_i$ is the JK fluctuation defined by Eq. 3^2. We then
make a similar error estimate from a new dataset, which
comprises $N_{\rm sub} - 1$ zones, omitting the $i^{\rm th}$ zone from the
set altogether:

$$\sigma_{{\rm tot}-i}^2 = \frac{1}{N_{\rm sub}-1} \sum_{j \neq i} \Delta_j^2. \qquad (5)$$



The Jackknife resampling is done in this instance sampling
only from the $N_{\rm sub} - 1$ zones, without considering the $i^{\rm th}$
zone at all. The rms error obtained in this way is written as
$\sigma_{{\rm tot}-i}$.

In order to quantify how the scatter in the clustering
depends upon which zone is left out, we introduce a new
statistic, the JK ensemble fluctuation, $\delta^i_{\rm JK}$,
defined as:

$$\delta^i_{\rm JK} = \frac{\Delta_i}{\sigma_{{\rm tot}-i}}, \qquad (6)$$

which is just the JK resampling fluctuation for the $i^{\rm th}$ zone
normalized to its corresponding rms error, i.e. $\sigma_{{\rm tot}-i}$.
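The chain of definitions in Eqs. 3 to 6 can be illustrated with the sketch below, written for a single bin of the statistic. It rests on a simplifying assumption that should be stated clearly: the rms error with zone $i$ removed is computed here by reusing the fluctuations $\Delta_j$ of the remaining zones from the original resampling, whereas a full implementation would repeat the Jackknife resampling on the reduced dataset of $N_{\rm sub} - 1$ zones. The function name is hypothetical.

```python
import numpy as np


def jk_fluctuations(x_jk, x_full):
    """JK resampling and ensemble fluctuations for one bin of a statistic.

    x_jk   : array of shape (N_sub,); element i is the statistic measured
             with zone i omitted
    x_full : the statistic measured from the full sample
    """
    x_jk = np.asarray(x_jk, dtype=float)
    n_sub = x_jk.size

    delta = x_jk / x_full - 1.0              # JK resampling fluctuation, Eq. 3
    sigma_tot = np.sqrt(np.mean(delta ** 2))  # relative rms error, Eq. 4

    # Eq. 5 under the simplifying assumption above: drop zone i from the
    # sum and normalise by the N_sub - 1 retained zones.
    sum_sq = np.sum(delta ** 2)
    sigma_minus_i = np.sqrt((sum_sq - delta ** 2) / (n_sub - 1))

    delta_jk = delta / sigma_minus_i          # JK ensemble fluctuation, Eq. 6
    return delta, sigma_tot, sigma_minus_i, delta_jk
```

Because an outlying zone inflates $\sigma_{\rm tot}$ but not its own $\sigma_{{\rm tot}-i}$, the ensemble fluctuation of an outlier is larger in absolute value than the same deviation expressed in units of $\sigma_{\rm tot}$, which is why it separates outliers more cleanly.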

The probability of finding a particular value of the
JK ensemble fluctuation is plotted in Fig. 5 for a distribution
which is very similar to that expected for the SDSS
data, derived from an ensemble of N-body simulations described
in Section 5. In Fig. 5, $\delta_{\rm JK}$ is estimated for $\xi(s)$ over

^2 Note that we do not multiply here by $N_{\rm sub} - 1$, as would be
necessary to scale this variance to the one corresponding to the
full sample (Norberg et al. 2009). Here we are interested instead
in how the Jackknife errors fluctuate from one resampling to another.
Figure 5. The probability of finding a JK ensemble fluctuation,
$\delta_{\rm JK}$, below some value for the distribution of clustering measurements
from an ensemble of SDSS mocks made from N-body
simulations (see Section 5) for three values of $N_{\rm sub}$ (solid lines).
The $\delta_{\rm JK}$ distribution is estimated here for $\xi(s)$ with $N_{\rm sub} = 27$,
64 and 125 (black, red and green respectively) over the range
$12 - 14\,h^{-1}$ Mpc. The corresponding Gaussian distributions (same
colours, but dotted lines) have the same median value and a variance
which encloses 68% of the simulation measurements. The
JK ensemble fluctuation for a small number of JK zones ($N_{\rm sub} = 27$)
presents a non-negligible non-Gaussian tail, highlighting the importance
of using N-body simulations to estimate its cumulative
probability distribution function, while for larger $N_{\rm sub}$ values the
distribution can be well approximated by a Gaussian.



$12 - 14\,h^{-1}$ Mpc, the range of scales most relevant for this
paper. For comparison purposes, we plot a Gaussian that
has the same median value as that obtained from the simulations
and has a variance which corresponds to the range
which encloses 68% of the correlation functions measured
from the simulations. These distributions describe the expected
range of fluctuations in the new JK ensemble fluctuation.
For small values of $\delta^i_{\rm JK}$ the two distributions are
similar, due to the way the Gaussian has been set up. However,
they differ appreciably in the tails. In the case of the
Gaussian, a value of $\delta^i_{\rm JK} < -1.5$ would be obtained 3%
of the time, whereas for the distribution from the simulations,
we would expect to see such a value in around 8% of
cases; the 3% chance in the simulation case corresponds to
$\delta^i_{\rm JK} < -2.5$. The non-Gaussian nature of the distribution
of the JK ensemble fluctuation decreases with the number
of JK samples considered, and approaches asymptotically
the corresponding Gaussian distribution for $\delta_{\rm JK}$. Therefore
for a small (large) number of JK samples, one has to consider
the non-Gaussian (Gaussian) distribution to estimate the
significance level of a given $\delta^i_{\rm JK}$.

In Fig. 6 we use the new statistic, δ_JK, defined by Eq. 6,
to quantify the significance of the fluctuations in the
clustering measurements. The error is calculated using pair
separations in the range 12 - 14 h^-1 Mpc. In the left-hand
panel we plot the JK ensemble fluctuation for ξ(s) on splitting
the data into 25 Jackknife zones, and in the right-hand panel
we show the corresponding result for Q3(α), sampling similar
scales. Fig. 6 (left-hand panel) shows that, if the Jackknife
zones were truly independent, the fluctuation in the rms error
on leaving out zone 23 corresponds to δ_JK < −3, which,
according to Fig. 5, should occur around 1.6% of the time.
For a Gaussian distribution, such an extreme value of δ_JK is
even less likely. We note that this probability comes from the
fraction of JK regions from an ensemble of simulations that
have δ_JK < −3, as δ_JK is estimated for each JK region. In
other words, the constraint is on the probability of a JK
region being this (or more) extreme and not on the probability
that a survey contains such an extreme region. From the
simulations we infer that ~ 50 per cent of SDSS ref-like
volumes would contain one or more JK regions with δ_JK < −3.
For Q3(α) (right-hand panel of Fig. 6), the fluctuation in the
rms error on leaving out zone 23 corresponds to δ_JK ~ −6,
which translates into a probability of less than ~ 0.4% (see
Fig. 5), assuming the distribution of the JK ensemble
fluctuation is the same for ξ(s) and Q3(α). It should be noted
that the δ_JK distribution is expected to depend on the
statistic considered, as well as on the number of zones. In
our comparisons we always find that outliers in Q3(α) tend to
correspond to those in ξ(s). Hence we assume here that the
distribution of δ_JK is similar for the two- and three-point
correlation functions.
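
The difference between the probability that a given JK region is
extreme and the probability that a survey contains at least one
such region can be illustrated with a short calculation. The
sketch below assumes, purely for illustration, that the zones are
statistically independent; the function name is ours and is not
part of the analysis described here.

```python
def prob_at_least_one(p_region, n_regions):
    """Probability that at least one of n_regions independent zones
    is an outlier, given the per-zone outlier probability p_region.
    Independence between zones is an illustrative assumption."""
    return 1.0 - (1.0 - p_region) ** n_regions


# Per-zone probability of 1.6% (delta_JK < -3) and a 25-zone quilt.
p_survey = prob_at_least_one(0.016, 25)
print(f"P(at least one outlier zone) = {p_survey:.2f}")
```

Under this independence assumption, a per-zone probability of 1.6%
gives roughly a one-in-three chance of finding at least one such
zone among 25. This is of the same order as, though lower than,
the ~ 50 per cent inferred directly from the simulations.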

For both the two- and three-point correlation functions,
the measurements on omitting zone 23 are significant outliers,
not only visually, as in Fig. 4, but also statistically when
interpreting their δ_JK value (less than −3 for both ξ(s) and
Q3(α)) in terms of the global δ_JK distribution extracted
from the simulations. It is not surprising that the
significance level is much larger for the higher-order
statistic (i.e. Q3(α)), as it is well known that such
statistics are much more sensitive to large-scale fluctuations
than the two-point function. We note that zone 22 is also an
outlier for the reduced 3-point statistic, albeit to a much
lesser extent than zone 23. This is only really noticeable
with this new statistic: in the right-hand panel of Fig. 4 the
first impression is that zones 22 and 23 are outliers to a
roughly similar extent.

Fig. 7 presents the JK ensemble fluctuation for the
Nsub=225 quilt. Zones 187 and 217 to 220 stand out as
much as zone 23 did for the Nsub=25 quilt. These zones
from the 225 Jackknife quilt are mostly co-spatial and overlap
mainly with zones 22 and 23 (and a small amount with
zone 24) from the 25 Jackknife quilt, as shown more
explicitly in Fig. 3. Together they map out the central parts
of the SDSS Great Wall (Gott et al. 2005), a feature uncovered
partially in the earlier 2dFGRS analysis of Baugh et al.
(2004). Out of the seven outliers in the 225 Jackknife quilt,
five belong to this structure.



Any scale could be considered for this statistic, but we decide
here to focus on one range appropriate for both statistics; with
Q3(α) sampling scales between 8 h^-1 Mpc and 24 h^-1 Mpc, it is
natural to settle on the range 12 - 14 h^-1 Mpc.

Figure 7. The same plot as in the left panel of Fig. 6 but for
Nsub=225, i.e. for the 15x15 Jackknife quilt, as shown in the right
panel of Fig. 2. For clarity, crosses have been omitted whenever
|δ_JK| > 2.5. There are 7 zones which present significant deviations
from the expectation in the JK ensemble fluctuation statistic (as
defined in Eq. 6), using pair separations of 12 - 14 h^-1 Mpc. For
this large Nsub, the |δ_JK| > 2.5 threshold corresponds to a 2.5-σ
level, implying that the Gaussian expectation for the number of
outliers is just ~ 1.4 ± 1.2, significantly lower than the number of
observed outliers.
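
The Gaussian expectation quoted in the caption of Figure 7 can be
checked with a few lines of code. The sketch below assumes a
one-sided 2.5-σ tail of a unit Gaussian and Poisson scatter on the
count; both are our assumptions, chosen because they reproduce the
quoted numbers, and the function names are illustrative.

```python
import math


def gaussian_tail(threshold):
    """One-sided probability P(x > threshold) for a unit Gaussian."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def expected_outliers(n_sub, threshold):
    """Mean number of zones beyond the threshold and its Poisson scatter."""
    mean = n_sub * gaussian_tail(threshold)
    return mean, math.sqrt(mean)


mean, scatter = expected_outliers(225, 2.5)
print(f"{mean:.1f} +/- {scatter:.1f}")  # prints "1.4 +/- 1.2"
```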



3.3 Blind search and number of JK 

The variant on the Jackknife technique we have described in 
this section is an objective way to find unusual structures in 
the galaxy distribution. The Jackknife zones provide a blind 
test to detect superstructures based on uncovering outliers 
in clustering measurements, although it is unlikely that this 
method would be the preferred one for actually detecting 
superstructures. It is certainly more valuable as a test of the
homogeneity of the JK subsampling chosen. Furthermore, 
we have introduced a statistic which allows us to quantify 
the significance of the outliers. Another feature of this ap- 
proach is that it can be used to suggest the appropriate size 
of Jackknife zone to use in error estimation. If, for example, 
we find that several adjacent zones give outlying cluster- 
ing results, as is the case for the 225 Jackknife quilt above, 
then the analysis should be repeated with larger zones. In 
this way, either the outliers disappear or the unusual clus- 
tering is restricted to one zone, as we found for the 25 quilt 
Jackknife. With one outlier zone, one could then present the 
clustering analysis both with and without this zone to show 
its impact on the results (as was done in our earlier higher- 
order clustering analyses of the 2dFGRS, e.g. Baugh et al. 
2004; Croton et al. 2004, 2007; Gaztañaga et al. 2005).
Similarly, Zehavi et al. (2005, 2011) presented SDSS results for
a full M* galaxy sample and for a sample with a redshift 
limit designed to exclude the Sloan Great Wall. 



3.4 Impact on errors 

Another consequence of the analysis in this section is the 
impact of zones containing unusual structures on the error 
estimated using the Jackknife method. Fig. 4 shows that
the Jackknife resampling which produces an outlying clustering
statistic can lead to an overestimate of the variance. The
blue errorbars in Fig. 4 show the variance estimated using
all of the Jackknife resamplings, including the outlier. The
red errorbars show the variance estimated from a Jackknife
resampling of 24 zones in the 25 patch quilt, leaving out zone
23 altogether. For the two-point function, there is a modest
reduction in the variance on leaving out the zone containing
the superstructure, but more importantly there is a significant
uncertainty in the corresponding Jackknife covariance
matrix when one sample is the source of most of the variance.
This becomes clear in Fig. 6 for both ξ(s) and Q3(α),
where one notices a systematic bias in the scatter in the
JK ensemble fluctuation. If all samples were equally important
for the variance, this new statistic would be distributed
symmetrically around the expectation value. This is not the
case for either of the clustering statistics considered here.
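
To make the effect of a single outlying resampling concrete, the
sketch below uses the standard delete-one Jackknife variance,
σ² = (N − 1)/N Σ (x_i − x̄)². This section does not restate the
estimator, so we assume it has this usual form. The toy numbers are
invented for illustration only and are not measurements.

```python
import math


def jackknife_rms(estimates):
    """Standard delete-one Jackknife rms error from N resampled estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = (n - 1) / n * sum((x - mean) ** 2 for x in estimates)
    return math.sqrt(variance)


# Toy data: 24 resamplings scattered by 1% around 1.0, plus one outlier.
regular = [1.0 + 0.01 * (-1) ** i for i in range(24)]
with_outlier = regular + [1.10]

rms_all = jackknife_rms(with_outlier)  # all 25 resamplings
rms_reduced = jackknife_rms(regular)   # outlying resampling dropped
print(f"rms with outlier: {rms_all:.3f}, without: {rms_reduced:.3f}")
```

In this toy case the single outlier supplies most of the variance.
Dropping one resampled value is only a shortcut: removing a zone
altogether, as done for the red errorbars, requires repeating the
Jackknife on the remaining zones.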



4 THE INFLUENCE OF SUPERSTRUCTURES 

Once we have identified a superstructure using the Jack- 
knife approach set out in the previous section, we can study 
its influence on the clustering measured from other volume
limited samples drawn from the survey. Croton et al. (2005) 
found that the 2dFGRS superstructures had an impact only 
on the M* volume limited sample, with Zehavi et al. (2005) 
reaching almost the same conclusion from their SDSS anal- 
ysis. In the case of the sample one magnitude brighter than 
M*, the clustering measurements did not show any unusual 
features, even though the superstructure was also included 
within the volume. Two things change when moving from 
the M* volume limited sample to a sample that is one mag- 
nitude brighter: the volume of the sample increases and the 
way in which structures are represented changes. Impor- 
tantly, the number of galaxies tracing the superstructure 
is different between the two volume limited samples, and 
so the contrast of the superstructure will also be different. 
Also, a larger volume could result in other superstructures 
being included, diluting the impact of any one structure on 
the measured clustering. 

In an attempt to disentangle and understand these ef- 
fects (namely, the increase in the volume used to measure 
clustering and the representation of structures with different 
numbers of galaxies) we compare clustering measurements 
in a range of volume limited samples (which are listed in
Table 1). All our comparisons use relative clustering statistics
for a given sample, allowing us to properly compare different 
galaxies without the need to model tracer dependent biases, 
such as luminosity or colour dependent clustering. 

First, we look at different tracers within the same volume,
to isolate the impact of the sampling of the superstructure
or its contrast relative to the other structures within
the volume. In Fig. 8 we compare the clustering measured
from Jackknife resamplings of the three samples: the ref
(one magnitude bin centred on M*), the bright-ref (the
half magnitude bin brighter than M*) and faint-ref (the
half magnitude bin fainter than M*) samples. These galaxy
samples are constrained to cover approximately the same
volume and hence are subject to the same underlying density
fluctuations. The only difference between them is the
number of galaxies which populate the structures within the
volume. Fig. 8 shows that the identity of the outlying zone is
not sensitive to which galaxies trace out the superstructure
within the same volume. The left-hand panel of Fig. 8 shows
that zone number 23 from the 5x5 Jackknife quilt is a clear
outlier for all three samples. The right-hand panel of Fig. 8
shows the same comparison between samples for Q3(α). For
this statistic, the omission of two zones, either number 22
or 23, leads to measurements which stand out for all three
samples. Hence, the influence of the superstructure is not
affected by the number of galaxies used to trace it within a
fixed volume, nor by the tracer (to within the limitation of
the comparison done here, with galaxies spanning just one
magnitude in brightness).

Figure 8. Clustering statistics measured using Jackknife resamplings of galaxies from the same volume, using the ref sample (black),
the bright-ref (green) and faint-ref (red). The left-hand panel shows ξ(s) and the right-hand panel Q3(α). The most significantly
outlying clustering statistics are labelled by the omitted zone number (for clarity we place labels both on the center and on the right of
each line for Q3). The errorbars correspond to the rms clustering error from the Jackknife resamplings of the ref sample.

To put our conclusions on a quantitative footing, we
show in Fig. 9 the corresponding JK ensemble fluctuation of
ξ(s) for the three samples discussed in Fig. 8. As expected,
the JK ensemble fluctuation computed for ξ(s) is virtually
indistinguishable for the three samples extracted from the
same M* volume. Very similar results are found for Q3.

The second test we carry out is to compare clustering
measurements made from different volumes. Here we consider
bright galaxies (i.e. the half magnitude bin brighter
than M*) as this allows us to cover a wider redshift interval.
Again we compare results obtained from three of the
samples listed in Table 1: the bright-all, the bright-ref
(the low redshift half of bright-all) and the bright-far
(the high redshift half of bright-all) samples. The results
for the clustering statistics of the Jackknife resamplings of
these galaxy samples are shown in Fig. 10. We have labelled
the zones which were identified as outliers in the ref sample
on Fig. 10. Leaving out zone 23 produces an outlier in the
measurements made using the bright-ref sample, as we already
observed in Fig. 8. For the bright-all and bright-far
samples, zone 23 is no longer an outlier. The removal of other
Jackknife zones leads to larger changes in the measured
correlation function. However, the outliers in these cases are
not as dramatic as they were in the case of the ref sample.
The right-hand panel of Fig. 10 shows the same statistic for
Q3(α). For the bright-ref and bright-all samples, zone
numbers 22 and 23 are outliers, but again they are not as
dramatic in the bright-far sample. A different zone is an
outlier for the bright-far sample, as happened for ξ(s).

Figure 9. The JK ensemble fluctuation for ξ(s) measured from
the ref (black), faint-ref (red) and bright-ref (green) volume
limited samples for Nsub=25.

Figure 10. Clustering statistics measured using Jackknife resamplings of bright galaxies from the bright-all (black), bright-far
(red) and bright-ref (green) samples (see Table 1). The left-hand panel shows ξ(s) and the right-hand panel shows Q3(α). Outlying
correlation functions that were identified in the ref sample are labelled by the omitted zone number. The errorbars correspond to the
rms error from the Jackknife resamplings of the bright-all sample.

Again, we quantify the above conclusions by computing
the JK ensemble fluctuation. We show in Fig. 11 the
JK ensemble fluctuation of ξ(s) for the three samples
discussed in Fig. 10. As expected, zone 23 is only an outlier
in terms of this statistic for the ref sample, while in the
two other samples, i.e. bright-all and bright-far, the JK
ensemble fluctuation statistic for ξ(s) points to less extreme
results for the zones, with bright-all being the most
"uniform" of the three samples considered: this is due to its
volume being twice that of the two other samples. Finally the
most extreme zone of bright-all is a zone which in both
bright-ref and bright-far corresponds to δ_JK ~ −2.0 in
the JK ensemble fluctuation statistic. Very similar results
are found for Q3.

As the zones we use are defined in angle, they cover
a wide baseline in redshift and many individual structures
could contribute to what we are calling a superstructure.
Also, by increasing the volume, other structures could be
sampled which could have a similar impact on the clustering
signal. The comparison of the scatter of the Jackknife
resamplings in the bright sampling volume shows that the
omission of different zones can lead to outlying clustering
measurements. However, the departure of a given sub-volume from
the rest of the population seems to become less significant the
larger the volume.



Figure 11. The JK ensemble statistic for ξ(s) measured using
Nsub=25 for the bright-all, bright-far, bright-ref volume 
limited samples, plotted in black, red and green respectively. As 
expected, samples with the same type of galaxies but covering 
different volumes do not have the same JK ensemble fluctuation 
but more importantly, the larger the volume the less influential 
any given zone becomes. 

4.1 JK ensemble: a practical recipe 

With the JK ensemble fluctuation statistic one can assess
whether a sample has been split into the right number of
sub-zones for a Jackknife error analysis to be valid, as
opposed to split into too many small JK regions. The way to
proceed is by taking the following steps:

a. the size and number of JK regions should be such that
there is no "apparent clustering" in the δ_JK statistic of
neighbouring zones (i.e. neighbouring zones should have δ_JK
values which are independent from each other).

b. no JK region should present too extreme a value for the δ_JK
statistic compared to the others, i.e. the associated
probability of its occurrence (as derived from the δ_JK
distribution shown in Fig. 5 for ξ(s) with three values of
Nsub) should not be significantly at odds with the probability
of such an event actually happening (which for large Nsub
values can be modelled by a Gaussian process with Nsub
elements).

c. The probability threshold recommended by the present
work is ~ 3% (i.e. about 2-σ if Gaussian distributed). This
corresponds to δ_JK = −2.5 for ξ(s) with Nsub=27, and δ_JK ~
−2 for larger Nsub values (according to Fig. 5).

It is not always possible to find an appropriate number of JK
subsamples into which the survey should be split which
satisfies both conditions (a) and (b), unless one is reduced
to using far too few subsamples for a statistical inference to
be made. In the limit Nsub=1, both conditions are automatically
satisfied, but no errors can be inferred. For example,
in our analysis of the SDSS M* sample (ref sample in
Table 1), Nsub=25 is not appropriate, but is at least better
than Nsub=225. The JK ensemble fluctuation statistic provides a
quantitative measure of the limitation of the error analysis
and an indication of the need to proceed with further checks
before statistically interpreting the results. For example, do
the outliers agree with the expectations from simulations? In
summary, when conditions (a) and (b) cannot be satisfied,
we should use statistical inference with greater care than
simply assuming that a comprehensive (but inappropriate)
error analysis has solved the problem entirely.
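
Conditions (a) to (c) can be turned into a rough automated check.
The sketch below takes δ_JK values that have already been computed
for a square quilt, flags zones beyond a threshold, and reports
whether any two flagged zones share an edge. Treating edge-sharing
outliers as a proxy for "apparent clustering", the row-by-row grid
layout and the function names are all our assumptions and are not
part of the recipe itself.

```python
def outlier_threshold(n_sub):
    """Approximate |delta_JK| threshold for a ~3% probability:
    2.5 for quilts of about 27 zones or fewer, 2.0 for larger ones."""
    return 2.5 if n_sub <= 27 else 2.0


def check_quilt(delta_jk, threshold):
    """delta_jk: square grid (list of rows) with one value per zone.
    Returns the (row, column) positions of outlying zones and whether
    any two outlying zones are edge neighbours."""
    n = len(delta_jk)
    outliers = {(i, j) for i in range(n) for j in range(n)
                if abs(delta_jk[i][j]) > threshold}
    adjacent = any((i + di, j + dj) in outliers
                   for (i, j) in outliers
                   for di, dj in ((0, 1), (1, 0)))
    return sorted(outliers), adjacent


# Toy 5x5 quilt with two neighbouring outlying zones (invented values).
grid = [[0.0] * 5 for _ in range(5)]
grid[4][1], grid[4][2] = -2.7, -3.2
zones, clustered = check_quilt(grid, outlier_threshold(25))
print(zones, clustered)
```

If the second returned value is true, the recipe suggests repeating
the analysis with larger zones.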



4.2 Should one ignore the outliers in the analysis? 

An interesting question to ask is: "Is it better to ignore 
an outlying region altogether when estimating a clustering 
statistic and its associated error?" From a Bayesian point of 
view, such a "massaging" of the data set is clearly unattrac- 
tive, but because it has regularly happened in the past (e.g. 
Zehavi et al. 2002, 2005, 2011 for 2-pt clustering statistics of 
SDSS M* samples) we investigate here what consequences 
this can have. 

A proper analysis to answer the above question would 
require the recalculation of all clustering statistics and dis- 
carding from the start all JK regions identified as outliers. 
Indeed, the way our JK ensemble fluctuation works is such 
that if JK region i is an outlier, then all other JK estimates 
contain region i. Hence discarding JK region i requires the 
re-estimation of the clustering signal using only the remain- 
ing Nsub-1 JK regions. Such a calculation is unfortunately 
beyond the scope of this paper,* so instead we propose the
following two tests: 



* The CPU time required to do the full analysis is close to
equivalent to what was required in Norberg et al. (2009).

Figure 12. Top: The percentage deviation in the 2-pt correlation
function over the N-body ensemble of realisations. The mean
result for all the JK resamplings is shown in black, while the red
curve shows the result of considering only the outlying regions,
which are defined as those for which |δ_JK| > 2.5. Errorbars show
the error on the mean. Bottom: The percentage difference in the
relative error on the 2-pt correlation function compared to the
N-body ensemble of realisations. As in the top panel, the black line
shows the mean result for the ensemble of JK resamplings, while
the blue curve presents the result of estimating the 2-pt clustering
error after discarding the outlying JK region from the error
analysis, with errorbars showing the error on the mean. The top
panel shows that discarding the outlying regions systematically
affects the overall clustering amplitude on the scales of interest
(from ~ 2 to ~ 10 per cent between 3 and 30 h^-1 Mpc). The
bottom panel shows that ignoring the outliers just for the error
analysis introduces a systematic underestimate of ~ 10 to ~ 20
per cent on those same scales, while the ensemble of JK resamplings
reproduces fairly accurately the results from the ensemble
of N-body simulations (to within a few per cent). Further details
are given in §4.2.



a. the mean clustering signal from outlying regions should 
be comparable to the mean clustering signal from the ensem- 
ble of simulations. If not, then ignoring the outlying regions 
will influence the overall amplitude of the clustering signal, 
which clearly is unwanted. 

b. if one cannot ignore the outlying JK regions in the esti- 
mate of the clustering statistic, is it valid to ignore them in 
the error estimate of the clustering signal? The point here is 
to understand whether excessive fluctuation due to a "sin- 
gle" region is better discarded in the error analysis. 

These two tests are minimal conditions to be satisfied for
discarding outlying regions from the analysis, where an outlying
region is defined here by |δ_JK| > 2.5, appropriate for a ~ 2-σ
cut for Nsub=27.

To answer test (a), the top panel of Fig. 12 displays the
relative effect on the 2-point clustering signal of discarding
the outlying regions (red) or using the ensemble of JK
resamplings (black). In both cases the results are compared
to the clustering signal from the ensemble of N-body
simulations (ξ_ref(s), with its associated error). The errorbars
show the error on the mean, which is the relevant quantity for
the present analysis as we are interested in the mean trend and
any deviations from its expected value (as shown by the dotted
horizontal line). To avoid calculating new estimates,
the red curve shows the results from averaging the clustering
signal in all the outlying regions, selected from each
simulation via the condition |δ_JK| > 2.5. We point out that we
had to ignore volumes with two or more outliers (~ 10 per cent
of the volumes), as in those cases the clustering signal of an
outlying JK region is not necessarily representative of the
clustering of the volume with all outlying regions removed.
The top panel of Fig. 12 clearly shows that the outlying
JK regions have a systematically different clustering signal
from the simulation ensemble, hence ignoring them in any
clustering analysis would result in systematically low
clustering amplitudes. We additionally note that there is scale
dependence in this bias, changing from ~ 2 to ~ 10 per cent
between 3 and 30 h^-1 Mpc.



To answer test (b), the bottom panel of Fig. 12 displays
the percentage difference in the relative error on the 2-point
clustering signal on discarding the outlying regions (blue)
or using the ensemble of JK resamplings (black). In both
cases the results are compared to the relative clustering
error from the ensemble of N-body simulations.
As for the top panel, the errorbars indicate the error on the
mean. Since test (a) showed that it is incorrect to ignore
the outlying JK region for the estimate of ξ(s), and to avoid
calculating new clustering estimates, the blue curve in fact
shows the results from averaging the relative error on ξ(s)
ignoring the contribution from all the outlying regions,
selected from each simulation via the condition |δ_JK| > 2.5.
Yet again, only simulation volumes with a single outlying
region were considered, as only for these volumes can we
relate the excess clustering to one JK region (which does not
require any re-estimation of the clustering statistics). The
bottom panel of Fig. 12 shows that, while the relative error
from the ensemble of JK resamplings is within a few per
cent of the relative error from the simulation ensemble, the
relative error estimated from ignoring the outlying regions
is underestimated by typically 15 to 20 per cent on scales
above ~ 5 h^-1 Mpc. It is worth noting that the error on
scales smaller than a few h^-1 Mpc is, as already noted in
Norberg et al. (2009), overestimated by a significant amount
when using JK resamplings compared to the simulation
ensemble.

Hence using our series of N-body simulations we have 
shown in this section that not including the outlying JK 
region in the estimate of the clustering statistic and its errors 
significantly affects the results and hence should be avoided. 



5 IS THE SDSS M* SAMPLE COMPATIBLE 
WITH ΛCDM?

Having identified an unusual overdensity of M* galaxies in 
the SDSS, it is natural to ask if such a structure is expected 
in the ΛCDM cosmology. We address this question using a
suite of large volume N-body simulations by Angulo et al. 
(2008). The same calculations were used in the evaluation of 
internal error estimation schemes by Norberg et al. (2009). 



The L-BASICC simulation ensemble of Angulo et al.
(2008) comprises 50 moderate resolution runs, each representing
the matter distribution using 448^3 particles of mass
1.85 × 10^12 h^-1 M⊙ in a box of side 1340 h^-1 Mpc. Each
L-BASICC simulation was evolved from a different realization
of a Gaussian density field set up at z = 63, using the
following cosmological parameters: Ω_m = 0.25, Ω_Λ = 0.75,
σ_8 = 0.9, n = 1, w = −1 and h = H_0/(100 km s^-1 Mpc^-1) = 0.73.
The combination of a large number of independent realizations
and the huge simulation volume makes the L-BASICC
ensemble ideal for searching for unusual structures (see also
Yaryura, Baugh & Angulo 2010).
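
The particle mass follows directly from the box size, the particle
number and the matter density. The sketch below is a consistency
check rather than part of the simulation pipeline; it assumes the
standard value of the critical density, about 2.775 × 10^11 h^2 M⊙
Mpc^-3, and the function name is illustrative.

```python
# Critical density today in h^2 Msun Mpc^-3 (standard value, rounded).
RHO_CRIT = 2.775e11


def particle_mass(omega_m, box_size, n_per_side):
    """Particle mass in h^-1 Msun for a box of side box_size (h^-1 Mpc)
    filled with n_per_side**3 equal-mass particles."""
    return omega_m * RHO_CRIT * box_size ** 3 / n_per_side ** 3


m_p = particle_mass(0.25, 1340.0, 448)
print(f"particle mass = {m_p:.3e} h^-1 Msun")
```

With 448^3 particles, a box of side 1340 h^-1 Mpc and a matter
density parameter of 0.25, this gives a value within about half a
per cent of the quoted particle mass.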

Following the method set out in Norberg et al. (2009),
we extract from each simulation cube two cubical sub-volumes
of 380 h^-1 Mpc on a side, which are separated by at
least ~ 500 h^-1 Mpc. These pairs of volumes are correlated
because they come from the same simulation box. However,
in practice, because of the wide separation of the subvolumes
and the low amplitude of the power at these wavelengths,
we can treat them as being independent. Hence, with this
procedure we construct 100 mock catalogues, each of which
is fully independent of 98 of the others (as these come from
different simulation boxes) and is essentially independent of
the other subvolume taken from the same simulation.

The redshift space distortion of clustering is not modelled
using the distant observer approximation within the
extracted region. Instead we place the observer at the centre
of the simulation volume and extract a region that
is at a distance comparable to the SDSS M* sample, i.e.
~ 300 h^-1 Mpc away. The cubical region that we extract is
somewhat bigger than the SDSS M* sample. This is to provide
a spatial buffer, as peculiar velocities can displace particles
in either direction along the line of sight, and particles
can be moved into as well as out of the volume of interest.
Finally, we randomly dilute the number of dark matter particles
in each data set to match the number density of the
M* galaxy sample of 3.7 × 10^-3 h^3 Mpc^-3, which mimics
the discreteness or shot noise level in the SDSS M* volume
limited sample. Following Norberg et al. (2009), we do not
attempt to model a particular galaxy sample and survey geometry
in detail, but simply to cover a comparable volume
with the same number density of objects. Our aim here is to
make a generic comparison: are enough large structures expected
in the ΛCDM cosmology to account for the outliers
we have seen in 1/25th of a volume limited M* SDSS galaxy
sample?
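
The random dilution step can be sketched as follows. This is a
minimal illustration under our own assumptions (uniform random
subsampling without replacement, a fixed seed for reproducibility);
the function name and the toy numbers are hypothetical and do not
describe the code actually used.

```python
import random


def dilute(particles, number_density, volume, seed=0):
    """Randomly subsample particles so that the retained set has
    round(number_density * volume) members. If fewer particles are
    available than requested, all of them are returned."""
    n_keep = round(number_density * volume)
    if n_keep >= len(particles):
        return list(particles)
    return random.Random(seed).sample(particles, n_keep)


# Toy example: 1000 particle indices, target density 0.5 per unit
# volume in a volume of 100 units, i.e. 50 particles retained.
kept = dilute(list(range(1000)), number_density=0.5, volume=100.0)
print(len(kept))
```

Subsampling to a fixed number density sets the shot noise level of
the mock to that of the target galaxy sample, without attempting to
model how galaxies trace the dark matter.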

We repeat the Jackknife resampling analysis we carried
out earlier for the SDSS data samples, but for the L-BASICC
dark matter catalogues, and present the 2-point clustering
estimates in Fig. 13. Each panel shows results for a different
subvolume from the ensemble (in red, with the subsample
number indicated in the panel legend) and the corresponding
result for the SDSS ref sample (in black and replicated in
all panels). Each line corresponds to the fluctuation of ξ(s)
measured from one of the Nsub Jackknife resamplings around
the measurement from the corresponding full volume limited
sample.

The L-BASICC regions we analyze have a volume equal
to that of the bright-all sample, which is twice the size of
the ref sample, so we expect the scatter in the Jackknife
resamplings for the simulations to be smaller than for the
ref sample. We found a similar result in Fig. 10, with the
scatter from the bright-all sample being systematically
smaller than either of its two components, i.e. bright-ref
and bright-far.

Figure 13. The JK resampling fluctuation for the two-point correlation function in nine (one per panel) randomly chosen L-BASICC
ΛCDM simulations that are all statistically similar and from which a volume comparable to an M* SDSS sample has been extracted (red
lines). The equivalent statistic for the SDSS data is shown using black lines and is reproduced in each panel. The thick black line shows
the clustering measured on omitting zone 23.

Mainly for this reason and for slight mismatches between
the simulation and the SDSS data (see Norberg et al.
2009 for a detailed discussion), the scatter measured in the
simulation results should be considered as a lower limit
compared to that displayed by the SDSS data in the ref sample,
or better matched to the bright-all sample (based on simple
volume arguments). The main conclusion we can draw
from Fig. 13 is that outliers comparable to or even more
extreme than that produced by zone 23 in the data are
reasonably common in the ΛCDM cosmology. In approximately
one third of the randomly chosen cases, the scatter in
the simulation Jackknife resamplings is comparable to what
is seen in our reference volume limited sample. Hence the
structure in zone 23 does not present a problem for ΛCDM.

Figure 14. Comparison of Q3(α) estimates from the 2dFGRS
(blue continuous line with errorbars) and SDSS (open squares
with errorbars). SDSS errorbars are from the Nsub = 25 JK
subsamples. 2dFGRS errors are from the dispersion of 22 galaxy
mocks that match 2dFGRS 2-point correlations (see Gaztañaga
et al. 2005 for details). The dashed blue line corresponds to the
2dFGRS measurement after removing the 2 largest superclusters.
The dotted line corresponds to the result from the SDSS after
leaving out JK subsample 23 (i.e. all minus outlier subregion 23).
Red lines show the corresponding results from the 2dFGRS red
selected sample, which ought to provide a better comparison with
the r-band SDSS selected galaxies. Errorbars in the bottom panel
show the relative JK rms fluctuations in SDSS (same as in Fig. 4,
but scaled by the usual JK factor √Nsub = 5). The dotted line
corresponds to results after leaving out SDSS JK subsample 23. The
blue dashed line shows the relative fluctuation for the 2dFGRS
sample after excluding the superstructures.



6 SUPERSTRUCTURES IN 2dFGRS & SDSS 

Previous studies which estimated the higher-order clustering
statistics from the 2dFGRS (e.g. Baugh et al. 2004; Croton
et al. 2004, 2007; Gaztañaga et al. 2005) and SDSS surveys
(e.g. Nichol et al. 2006; McBride et al. 2011) found that the
presence of one or two superstructures had a strong influence
on the interpretation of the measurements for their M* samples,
which are all similar in their definitions to the one
considered here, covering roughly the same redshift range, i.e.
0.05 ≲ z ≲ 0.11, and sampling mostly M* galaxies. In some
work, clustering statistics were presented both for full samples
and for samples from which the volume containing the
superstructure had been cut out, referred to as the "with"
and "without" supercluster results. This practice was merely
intended to illustrate the impact of these structures on the
measurements, rather than to advocate one or the other as
being the definitive or correct answer. An example of this
practice is shown for the 2dFGRS in Fig. 14, which shows
the "with" and "without" measurements of the 3-point function
for the M* volume limited sample, as first presented
in Gaztañaga et al. (2005). At that time, because of the
substantial difference between these two measurements, the
M* sample results were considered unreliable. Therefore the
analysis and interpretation focused mostly on larger volume
limited samples, corresponding to brighter galaxies. Due to
a combination of increased volume and the way in which
brighter galaxies sample or weight structures (compared to
M* galaxies), these volume limited samples seemed to be less
affected by any particular superstructure.

There is one subtle, but important difference between 
these previous analyses and the one presented in this pa- 
per. Here an objective, blind approach is taken, while in the 
earlier works, a bespoke supercluster mask was constructed, 
starting from the location of high peaks in the 3-D den- 
sity field and then masking all galaxies within a radius of 
25 h^-1 Mpc from the centre of the peak (see Baugh et al.
2004 and Croton et al. 2004 for details). This method ac- 
tually results in a more significant difference between the 
measurements "with" and "without" the superclusters. It is 
therefore natural to revisit these results and compare them 
with the more objective method considered here. 

In Fig. 14 we show a comparison between Q3(α) estimated
from the 2dFGRS and SDSS M* samples. The 2dFGRS
clustering estimates are from Gaztañaga et al. (2005),
with two sets of measurements plotted: the "with superclusters"
measurement (continuous lines) and the "without superclusters"
measurement (dashed lines). The SDSS DR7 measurements
from the ref sample (squares with errorbars) are
new and fall consistently between the results from the 2dF- 
GRS analysis. It is reassuring to see how clustering mea- 
surements for 3-point statistics are reproducible; they come 
from different parts of the sky, are obtained using differ- 
ent telescopes/cameras and, most importantly, are based on 
slightly different galaxy selections. To attempt to remove the 
difference in selection between the 2dFGRS and SDSS sam- 
ples, we also show the results for the 2dFGRS, both with 
and without superclusters, for a red selected subsample (see 
Gaztañaga et al. 2005 for the exact definition of their red
selected 2dFGRS M* sample). This should be more comparable
to the selection in the SDSS, which is done in the
r-band. The results are noisier because of the smaller number
of galaxies retained, but are similar to the measurements
from the full, blue selected 2dFGRS sample. 

In the bottom panel of Fig. 14, the dashed (blue) line
indicates the relative difference in Q3(α) (in percent) measured
from the 2dFGRS when we compare results with and
without the superclusters. The effect seen in the 2dFGRS 
is much larger than the analogous result in Fig. 4 for the
SDSS. The SDSS measurement is characterized by the errorbars
in the bottom panel of Fig. 14, scaled by the usual
JK factor √Nsub = 5 to show the variance in the full sample
(as opposed to the variance in the JK-subsample displayed
in Fig. 4). The shift in the measurement of Q3(α) from the
SDSS on leaving out the outlier subregion 23 is shown by
the dotted line. For the 2dFGRS, removing the supercluster 
from the clustering analysis produces a change of up to ~50
per cent in Q3(α), while the corresponding number for the
SDSS using the more objective method of JK resampling
introduced here is ~12 per cent. These differences reflect
both that the 2dFGRS is ~ 5 times smaller than SDSS and 
also the different ways in which outlying superstructures are 
defined in each case. Also note how the SDSS measurement 
on leaving out the outlier subregion is still within the JK 
errors while the 2dFGRS measurement without superclusters
is outside the errors (which came from an ensemble of
22 mock catalogues).
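
The rescaling between the scatter among JK resamplings and the error on
the full sample can be sketched as follows. This is only an illustration
with synthetic numbers, not the measurements of this paper; the function
name and the input array are hypothetical. It assumes the standard
delete-one Jackknife variance, whose prefactor is √(Nsub − 1), i.e.
about 4.9 for Nsub = 25, close to the factor of 5 quoted above.

```python
import numpy as np


def jackknife_error(resampled):
    """Delete-one jackknife error (hypothetical helper).

    resampled: array of shape (n_sub,) or (n_sub, n_bins), one row per
    resampling (i.e. the statistic measured with one zone left out).
    Returns (error on the full sample, scatter among the resamplings).
    """
    resampled = np.asarray(resampled, dtype=float)
    n_sub = resampled.shape[0]
    mean = resampled.mean(axis=0)
    # Population (1/N) standard deviation of the resampled measurements.
    scatter = np.sqrt(((resampled - mean) ** 2).mean(axis=0))
    # Standard jackknife variance: (N - 1)/N * sum of squared deviations.
    return np.sqrt(n_sub - 1) * scatter, scatter


# Synthetic stand-in for 25 resampled values of a statistic (assumption).
rng = np.random.default_rng(0)
q3_resampled = 0.8 + 0.01 * rng.standard_normal(25)

full_error, sub_scatter = jackknife_error(q3_resampled)
ratio = full_error / sub_scatter  # equals sqrt(24) for 25 zones
```

The ratio between the two returned quantities depends only on the number
of zones, which is why a single scaling factor relates the two sets of
errorbars.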

What have we learnt from this? From our new outlier 
analysis using Jackknife resampling and for the particular 
statistics (and scales) we consider, the impact of large scale 
inhomogeneities in the SDSS M* sample can be as large as 
~12% in Q3(α) on scales of 12-14 h⁻¹ Mpc. We can and
should use this information to test models, both in terms 
of interpreting a fit to the data and in terms of the outlier 
probability in realizations of a given model. These outlier 
probabilities are now significantly better defined than before,
as is clear from the bottom panel of Fig. 14 through the
difference between the 2dFGRS estimate "with" and "with- 
out" superclusters and the statistical Jackknife resampling 
method. More importantly, as shown in the previous sec- 
tion, we can apply the same objective algorithm (i.e. the 
Jackknife resampling method) to both data and simulations 
enabling statistically robust conclusions to be drawn. The 
previous method applied to the 2dFGRS, based on remov- 
ing superstructures, is subjective and significantly harder to 
transfer from data to mocks. It is now clear from the comparison
between the 2dFGRS and SDSS results in Fig. 14
that the error bars derived from mock 2dFGRS catalogues 
did not capture the full variance in that sample. This can 
be better understood if one accounts for the various limita- 
tions of the mocks, i.e. (a) we have only 22 non-overlapping 
(but still not independent) mocks drawn from the same Hub- 
ble Volume simulation (Evrard et al. 2002); (b) the mocks 
are only constrained to reproduce the two-point 2dFGRS 
galaxy clustering (Cole et al. 1998); (c) errors on cluster- 
ing statistics depend directly on their higher order moments 
(e.g. Bernardeau et al. 2002). This was already noted by 
Gaztañaga et al. (2005), who considered this sample to be
unreliable because of the effect of the superclusters. The situation
is different for the SDSS analysis presented here (in
Fig. 14), where, despite the presence of the outlier region 23
(see Fig. 3), we can be more confident of the error analysis
because we have a systematic way to quantify the impact
of outliers. 



7 SUMMARY

This paper presents a new method to quantify the robust- 
ness of error estimates for clustering statistics. The approach 
is an extension of the Jackknife technique and uses two new 
clustering diagnostics to quantify the distribution of clustering
measurements from different resamplings: the JK resampling
fluctuation (first shown in Fig. 4), and the JK ensemble
fluctuation (first shown in Fig. 6). The main features of the
method can be summarized as follows: 

• The technique provides an objective way of finding large 
coherent structures. As the Jackknife zones are set up a priori,
this is a blind test for the presence of superstructures,
based on their impact on the measured clustering statistics. 
This is an improvement over earlier work in which "unusual" 
regions were omitted after making the clustering measure- 
ments, without any firm guidance as to what volume should 
be excluded. 

• Our approach provides a quantitative way of determining
the appropriate size of the Jackknife zones to be used
in the error analysis of the clustering signal. It rests on
three key ingredients:

a. The size and number of JK regions should be such that
there is no "apparent clustering" in the δJK statistic of
neighbouring zones (i.e. neighbouring zones should have
δJK values which are independent of each other).

b. No JK region should present too extreme a δJK statistic
compared to the others, i.e. the associated probability
of its occurrence (as derived from the δJK distribution as
shown in Fig. 5 for ξ(s) with three values of Nsub) should
not be significantly at odds with the probability of such an
event actually happening (which for large Nsub values can
be modelled by a Gaussian process with Nsub elements).

c. The probability threshold recommended by the present
work is ~3% (i.e. about 2σ if Gaussian distributed). This
corresponds to δJK = -2.5 for ξ(s) with Nsub=27, and
δJK ≈ -2 for larger Nsub values (according to Fig. 5).

• The new statistics we have introduced can be used to 
compare observations to models. As shown in §5, we can
check in a straightforward and objective way whether or
not simulations produce outlying clustering measurements
similar to those seen in the data. Again, this is a significant
improvement over the ad hoc approach of computing the
correlation functions "with" and "without" the superstructures.

• The statistics offer a quantitative way to study the in- 
fluence of large coherent structures on the measured correla- 
tion function, even if the structures dominate the clustering 
signal. As shown in §6, with these new tools we can quantify
how inhomogeneities affect our measurements and the
reliability of our error estimates. 
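
Ingredients (b) and (c) above can be illustrated with a short sketch. It
assumes, purely for illustration, that the δJK values of the Nsub zones
behave as independent unit-variance Gaussian deviates and that the
threshold is applied to the lower tail only; the function names are
hypothetical, and the definition of δJK is the one given earlier in the
paper, not restated here. Under these assumptions the one-sided tail
below -2 has a probability of about 2.3 per cent, of the same order as
the ~3 per cent threshold quoted in (c).

```python
import math

import numpy as np


def tail_probability(threshold):
    """One-sided probability P(x < threshold) for a unit Gaussian."""
    return 0.5 * math.erfc(-threshold / math.sqrt(2.0))


def prob_at_least_one(threshold, n_sub):
    """Chance that at least one of n_sub independent Gaussian zones
    falls below the threshold (illustrative assumption)."""
    p = tail_probability(threshold)
    return 1.0 - (1.0 - p) ** n_sub


def flag_outliers(delta_jk, threshold=-2.0):
    """Indices of zones whose delta_JK value lies below the threshold."""
    delta_jk = np.asarray(delta_jk, dtype=float)
    return np.flatnonzero(delta_jk < threshold)


p_single = tail_probability(-2.0)
p_any_25 = prob_at_least_one(-2.0, 25)
# Made-up delta_JK values for four zones; only the second is flagged.
flagged = flag_outliers([0.1, -2.6, 1.0, -1.9])
```

Comparing the number of flagged zones with the expectation from
`prob_at_least_one` is one way to judge whether an outlier is
"significantly at odds" with chance for a given Nsub.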

In the standard application of the Jackknife method to 
galaxy surveys, the dataset is divided spatially into zones. 
Our procedure overcomes one of the long standing problems 
of Jackknife error estimation: how many zones should the 
data be split up into? A large number of zones implies a 
large number of resamplings of the dataset. To improve the 
stability of the inversion of the covariance matrix, it is important
to have as large a number of resamplings as possible
(e.g. Hartlap et al. 2007). However, this is a vague criterion: if the
Jackknife zones are made too small, the resamplings will be
essentially the same, with a very small change in the volume
covered. Our approach uses the Jackknife ensemble fluctuation
to determine the appropriate number of zones. This
statistic allows us to identify the omission of a zone as lead- 
ing to an outlying clustering measurement. We have argued 
that such outliers should be due to the structure contained 
within one Jackknife zone, rather than several contiguous 
zones. In our illustration using the M* SDSS volume limited 
sample, this suggested that Nsub=25 is a more robust num- 
ber of zones to use in the Jackknife method than Nsub=225. 
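
The tension between the number of zones and the invertibility of the
covariance matrix can be made concrete with a small sketch. The rows
below are independent random vectors standing in for resampled
correlation function measurements (an assumption made only to keep the
example self-contained; they are not Jackknife resamplings of a galaxy
sample), and the function name is hypothetical. The point illustrated is
general: a covariance matrix estimated from N resamplings has rank at
most N - 1, so it cannot be inverted when the number of bins exceeds
that.

```python
import numpy as np


def jackknife_covariance(resampled):
    """Jackknife covariance from an (n_sub, n_bins) array of resampled
    measurements, using the standard (N - 1)/N normalisation."""
    resampled = np.asarray(resampled, dtype=float)
    n_sub = resampled.shape[0]
    diff = resampled - resampled.mean(axis=0)
    return (n_sub - 1) / n_sub * diff.T @ diff


rng = np.random.default_rng(1)
n_bins = 12

# Fewer resamplings than bins: the matrix is singular.
few = rng.standard_normal((8, n_bins))
rank_few = np.linalg.matrix_rank(jackknife_covariance(few))

# Many more resamplings than bins: the matrix has full rank.
many = rng.standard_normal((40, n_bins))
rank_many = np.linalg.matrix_rank(jackknife_covariance(many))
```

This only addresses the lower limit on the number of zones; the upper
limit discussed in the text, set by zones becoming too small to be
independent, is what the Jackknife ensemble fluctuation is used for.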

The approach we have set out in this paper, along with 
the comparison between different error estimation methods 
presented in Norberg et al. (2009), provides a robust and 
objective prescription for the estimation of galaxy correlation
functions and their associated errors, which can be applied 
to any type of forthcoming survey, both spectroscopic and 
photometric. 

APPENDIX A: SQL QUERIES

In this appendix we list the SQL queries and data files used 
to generate the galaxy catalogues and to construct the imag- 
ing and spectroscopic completeness masks used in the paper. 



A1 Galaxy Catalogue

The CasJobs SQL query used to generate the input galaxy
catalogue from which the spectroscopic galaxy catalogue is 
derived is: 

SELECT ...
INTO ...
FROM PhotoPrimary po
LEFT OUTER JOIN Target t ON t.bestObjID = po.objid
LEFT OUTER JOIN SpecObj so ON po.specObjID = so.specObjID
WHERE po.primtarget & (64 | 128 | 256) != 0
AND po.status & (16384) != 0



This query accounts for galaxies with and without redshifts, 
but which were intended for spectroscopic targeting. Only 
galaxies whose redshift satisfies the GAMA (Driver et al. 
2009, 2010) selection criteria for a good redshift (see Baldry 
et al. 2010 for the exact definition) are used in the clustering 
analysis. 
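
The bitwise selection in the WHERE clause of the query can be mimicked
outside the database as in the sketch below. The flag values are those
appearing in the query; the constant and function names are hypothetical
and introduced only for illustration, and no meaning is attached here to
the individual bits beyond what the query states.

```python
# Bit values taken from the query above; names are illustrative only.
GALAXY_TARGET_BITS = 64 | 128 | 256
STATUS_BIT = 16384


def selected(primtarget, status):
    """True if any of the target bits is set in primtarget and the
    status bit is set in status, as in the WHERE clause."""
    return (primtarget & GALAXY_TARGET_BITS) != 0 and (status & STATUS_BIT) != 0


# A few made-up flag combinations.
example_a = selected(64, 16384)              # one target bit, status bit set
example_b = selected(32, 16384)              # no target bit set
example_c = selected(128 | 2, 16384 | 1)     # extra unrelated bits are ignored
example_d = selected(256, 0)                 # status bit missing
```

An object passes if at least one of the three target bits is set, not
necessarily all of them, which is what the OR-ed mask expresses.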



A2 Imaging and Spectroscopic Mask 

To generate the imaging and spectroscopic masks, the
following CasJobs SQL query can be used:

SELECT
run, field, raMin, raMax, decMin, decMax,
rerun, camcol, skyVersion
FROM field



together with these two DR7 files: 



http://www.sdss.org/dr7/coverage/tsChunk.dr7.best.par

http://www.sdss.org/dr7/coverage/maindr72spectro.par


listing the imaging coverage and the position of the spectro- 
scopic tiles respectively. It is also necessary to include the 
information about additional "holes" in SDSS DR7 coverage,
as listed on

http://www.sdss.org/dr7/coverage/holes.html .

The masks, quilts and galaxy catalogues can be made available
upon request by contacting the lead author.